Introduction to Computational Biology Lecture # 3: Estimating Scoring Rules for Sequence Alignment

Author

  • Yehonatan Sela
Abstract

2.1 Two different approaches

It is possible to create a scoring matrix by deliberately choosing criteria that reflect any arbitrary set of biological constraints. Yet we must realize that there are countless constraints to keep in mind, and once we have generated the matrix according to a chosen set of criteria we have hardly any assurance of its success in estimating the alignment score. We would rather create the matrix with a methodology that gives some indication of its success in estimating the likelihood of an alignment. For this we use a training set of "real" alignments. Two approaches exist:

1. Generative method: model how the frequency data observed in the training set is generated, and translate that model into a score.
2. Discriminative method: choose a score that prefers the training alignments over the alternatives.

Today, as well as throughout the course, we will focus our discussion on the first approach.
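
As a rough illustration of the generative approach, the sketch below estimates a score for each letter pair from the frequencies in a small training set of alignments. The log-odds form s(a, b) = log2(p_ab / (q_a q_b)), the function name `estimate_log_odds`, and the `pseudocount` parameter are assumptions made for this example; the abstract itself does not specify how frequencies are converted to scores.

```python
import math
from collections import Counter


def estimate_log_odds(alignments, alphabet, pseudocount=1.0):
    """Estimate log-odds scores from pairwise training alignments.

    alignments: iterable of (x, y) pairs of equal-length aligned strings.
    alphabet:   the letters to score; every non-gap letter in the training
                set is assumed to belong to it.
    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for x, y in alignments:
        for a, b in zip(x, y):
            if a == "-" or b == "-":
                continue  # gap columns are ignored in this sketch
            # Count each column in both orders so the matrix is symmetric.
            counts[(a, b)] += 1
            counts[(b, a)] += 1

    # Joint pair probabilities p_ab, smoothed with a pseudocount.
    total = sum(counts.values()) + pseudocount * len(alphabet) ** 2
    p = {
        (a, b): (counts[(a, b)] + pseudocount) / total
        for a in alphabet
        for b in alphabet
    }
    # Background letter probabilities q_a, taken as marginals of p_ab.
    q = {a: sum(p[(a, b)] for b in alphabet) for a in alphabet}

    # Score: how much more likely the pair is under the "related" model
    # than under independent background frequencies.
    return {
        (a, b): math.log2(p[(a, b)] / (q[a] * q[b]))
        for a in alphabet
        for b in alphabet
    }


training = [("AACC", "AACC"), ("AC", "CA")]
scores = estimate_log_odds(training, "AC")
```

With this toy training set, pairs that co-occur more often than chance (such as A aligned with A) receive positive scores, and pairs that co-occur less often receive negative ones.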

Similar resources

Computational Biology Lecture 18: Genome rearrangements, finding maximal matches

One possibility is to perform a global alignment of the two strings x and y with a special scoring scheme; for instance, +1 for a match, 0 for a mismatch, and 0 for a gap. Then we could identify all the maximal positively scoring chunks of the alignment. The disadvantages of this approach are that it requires O(mn) running time, might not obtain all candidate matches, and obtains matches that are...

gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences

Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...

Accuracy Estimation and Parameter Advising for Protein Multiple Sequence Alignment

Abstract We develop a novel and general approach to estimating the accuracy of multiple sequence alignments without knowledge of a reference alignment, and use our approach to address a new task that we call parameter advising: the problem of choosing values for alignment scoring function parameters from a given set of choices to maximize the accuracy of a computed alignment. For protein alignm...

Sequence Alignment as Hypothesis Testing

Sequence alignment depends on the scoring function that defines similarity between pairs of letters. For local alignment, the computational algorithm searches for the most similar segments in the sequences according to the scoring function. The choice of this scoring function is important for correctly detecting segments of interest. We formulate sequence alignment as a hypothesis testing probl...

Computational Biology Lecture 11: Pairwise alignment using HMMs

We looked at various alignment algorithms with different scoring schemes. We argued that the score of an alignment is related to the relative likelihood that the two sequences are related compared to being unrelated, and we used the log-odds ratio to express this relative likelihood while maintaining an additive scoring scheme. Therefore, maximizing the score of an alignment was in some sense ...

Publication date: 2008